2 research outputs found

    dispel4py: A Python framework for data-intensive scientific computing

    This paper presents dispel4py, a new Python framework for describing abstract stream-based workflows for distributed data-intensive applications. These combine the familiarity of Python programming with the scalability of workflows. Data streaming is used to gain performance, rapid prototyping and applicability to live observations. dispel4py enables scientists to focus on their scientific goals, avoiding distracting details and retaining flexibility over the computing infrastructure they use. The implementation, therefore, has to map dispel4py abstract workflows optimally onto target platforms chosen dynamically. We present four dispel4py mappings: Apache Storm, message-passing interface (MPI), multi-threading and sequential, showing two major benefits: a) smooth transitions from local development on a laptop to scalable execution for production work, and b) scalable enactment on significantly different distributed computing infrastructures. Three application domains are reported and measurements on multiple infrastructures show the optimisations achieved; they have provided demanding real applications and helped us develop effective training. The dispel4py.org is an open-source project to which we invite participation. The effective mapping of dispel4py onto multiple target infrastructures demonstrates exploitation of data-intensive and high-performance computing (HPC) architectures and consistent scalability.

    dispel4py: A Python framework for data-intensive scientific computing

    This paper presents dispel4py, a new Python framework for describing abstract stream-based workflows for distributed data-intensive applications. The main aim of dispel4py is to enable scientists to focus on their computation instead of being distracted by details of the computing infrastructure they use. Therefore, special care has been taken to provide dispel4py with the ability to map abstract workflows to different enactment platforms dynamically, at run time. In this work we present four dispel4py mappings: Apache Storm, MPI, multi-threading and sequential. The results show that dispel4py is successful in enacting on different platforms, while also providing scalable performance.